Querying Data Tutorial

MLDB comes with a powerful SQL-like Select Query implementation accessible via its REST API. This tutorial will show a few different ways to query data.

The notebook cells below use pymldb; you can check out the Using pymldb Tutorial for more details.


In [1]:
from pymldb import Connection
mldb = Connection()

Creating a sample dataset

First we will create a sample dataset, much like in the Loading Data Tutorial:


In [2]:
ex = mldb.put('/v1/datasets/example', {"type":"sparse.mutable"})
mldb.post('/v1/datasets/example/rows', { "rowName": "r1", "columns": [ ["a", 1, 0], ["b", 2, 0] ] })
mldb.post('/v1/datasets/example/rows', { "rowName": "r2", "columns": [ ["a", 3, 0], ["b", 4, 0] ] })
mldb.post('/v1/datasets/example/rows', { "rowName": "r3", "columns": [ ["a", 5, 0], ["b", 6, 0] ] })
mldb.post('/v1/datasets/example/rows', { "rowName": "r4", "columns": [ ["a", 7, 0], ["b", 8, 0] ] })
mldb.post('/v1/datasets/example/commit')


Out[2]:
POST http://localhost/v1/datasets/example/commit
200 OK

Querying into a DataFrame

We can use the query() shortcut function to run queries and get the results as Pandas DataFrames


In [3]:
df = mldb.query("select * from example")
print type(df)
df


<class 'pandas.core.frame.DataFrame'>
Out[3]:
a b
_rowName
r4 7 8
r3 5 6
r2 3 4
r1 1 2

Querying via REST

We can also make lower-level REST API calls to the query endpoint in the Query API, /v1/query for full SQL queries.


In [4]:
mldb.get('/v1/query', q="select * from example where a > 4", format="table")


Out[4]:
GET http://localhost/v1/query?q=select+%2A+from+example+where+a+%3E+4&format=table
200 OK
[
  [
    "_rowName", 
    "a", 
    "b"
  ], 
  [
    "r4", 
    7, 
    8
  ], 
  [
    "r3", 
    5, 
    6
  ]
]

We can control the format of the output JSON using the format attribute:


In [7]:
mldb.get('/v1/query', q="select * from example where a > 4", format="aos")


Out[7]:
GET http://localhost/v1/query?q=select+%2A+from+example+where+a+%3E+4&format=aos
200 OK
[
  {
    "_rowName": "r4", 
    "a": 7, 
    "b": 8
  }, 
  {
    "_rowName": "r3", 
    "a": 5, 
    "b": 6
  }
]

Where to next?

Check out the other Tutorials and Demos.


In [ ]: